Week 06: Cool Type Checking & Runtime Organization

Cool Type Checking

Static vs. Dynamic Typing

  • Static type systems detect common errors at compile time.
  • But some correct programs are disallowed.
    • Some argue for dynamic type checking instead.
    • Others want more expressive static type checking.
  • Def.
    • The dynamic type of an object is the class C that is used in the new C expression that created it at run-time.
    • The static type of an expression captures all dynamic types the expression could have at compile-time.

Soundness Theorem

  • Target
    • For all expressions $E$, dynamic_type(E) = static_type(E)
    • Sample
      • class B inherits A;
      • x:A <- new B;
      • Static type of x: x:A
      • Dynamic type of x: new B
  • Soundness theorem for the Cool type system: $$\forall E,\ dynamic\_type(E) \leq static\_type(E)$$
  • Subclasses only add attributes or methods
  • Methods can be redefined but with same type!

Self Type

Scenario (Why we need to use SELF_TYPE)

class Count {
    i : int <- 0;
    inc () : Count { {
        i <- i + 1;
        self;
      }};
};
class Stock inherits Count { name : String; };
class Main { Stock a <- (new Stock).inc(); ... a.name ... };
  • Because inc() will return Count.
    • Use inc():SELF_TYPE instead.
  • The type checker can now prove
    • $O,M,C\vdash (new\ Count).inc():Count$
    • $O,M,C\vdash (new\ Stock).inc():Stock$
  • Remark: SELF_TYPE is not a dynamic type. It is a static type. In cgen, it will fetch the class tag of the instance.

In general, if SELF_TYPE appears textually in the class $C$ as the declared type of $E$ then $dynamic\_type(E) \leq C$

The meaning of SELF_TYPE depends on where it appears

  • $SELF\_TYPE_C$ to refers to an occurrence of $SELF\_TYPE$ in the body of $C$
  • $SELF\_TYPE_C \leq C$

Type Operation

  • $T_1 \leq T_2$: $T_1$ is a subtype of $T_2$
  • $lub(T_1, T_2)$: the least-upper bound of $T_1$ and $T_2$

Let $T$ and $T’$ be any types but SELF_TYPE. Theorems between two types:

  1. $SELF\_TYPE_C \leq SELF\_TYPE_C$
    • $lub(SELF\_TYPE_C, SELF\_TYPE_C) = SELF\_TYPE_C$
  2. $SELF\_TYPE_C \leq T$, if $C\leq T$
    • $SELF\_TYPE_C$ can be any subtype of $C$
    • Include $C$ itself.
    • $lub(SELF\_TYPE_C, T) = lub(C, T)$
  3. $T \leq SELF\_TYPE_C$ always false.
    • Special case: $SELF\_TYPE_C$ is a path and $T$ is the leaf of the path.
    • $lub(SELF\_TYPE_C, T) = lub(C, T)$
  4. $T \leq T'$
    • $lub(T, T’)$

Self Type Usage

SELF_TYPE is not allowed everywhere a type can appear.

  1. class T inherits T’ {...}: T and T' cannot be SELF_TYPE
  2. x:T: T can be SELF_TYPE
  3. let x: T in E: T can be SELF_TYPE
  4. new T: T can be SELF_TYPE
  5. m@T(E_1, ..., E_n): T cannot be SELF_TYPE
  6. m(x : T) : T’ { ... }: Only T’ can be SELF_TYPE

Self Type Checking

The meaning of SELF_TYPE depends on the enclosing class $C$ $(O,M,C\vdash e:T)$.

We use lub to design type rules using SELF_TYPE

  • [Dispatch] $$\frac{O,M,C\vdash e_0:T_0\\ O,M,C\vdash e_1:T_1\\ \dots\\ O,M,C\vdash e_n:T_n\\ M(T_0, f)=(T'_{1},\dots,T'_{n'},T'_{n+1})\\ T'_{n+1} \neq SELF\_TYPE\\ T_i\leq T'_{i},\ for\ 1\leq i\leq n}{O,M,C\vdash e_0.f(e_1,\dots,e_n):T'_{n+1}}$$
  • [Static Dispatch] $$\frac{O,M,C\vdash e_0:T_0\\ O,M,C\vdash e_1:T_1\\ \dots\\ O,M,C\vdash e_n:T_n\\ M(T_0, f)=(T'_{1},\dots,T'_{n'},T'_{n+1})\\ T'_{n+1} \neq SELF\_TYPE\\ T_i\leq T'_{i},\ for\ 1\leq i\leq n}{O,M,C\vdash e_0@T.f(e_1,\dots,e_n):T'_{n+1}}$$

New rules using SELF_TYPE

  • $$\frac{}{O,M,C\vdash self: SELF\_TYPE_C} $$
  • $$\frac{}{O,M,C\vdash new\ SELF\_TYPE: SELF\_TYPE_C} $$

Error Recovery in Semantic

Problem

  • What type is assigned to an expression with no legitimate type?
  • This type will influence the typing of the enclosing expression.

Solution

  • Assign type Object to ill-typed expressions.
  • Or assign a new type No_type for use with ill-typed expressions
    • No_type $\leq$ C, for all types $C$
    • Every operation is defined for No_type
    • The class hierarchy is not a tree anymore.
  • The Object solution is fine in the course project.

Runtime Organization

Introduction

Goal

  • Management of Run-time Resources
  • Correspondence between static (compile-time) and dynamic (run-time) structures
  • Storage organization

When a program is invoked

  • The OS allocates space for the program
  • The code is loaded into part of the space
  • The OS jumps to the entry point (i.e., “main”)

Activations

Goal

  • Correctness of the program
  • Speed of the program

Assumption

  • Execution is sequential; control moves from one point in a program to another in a well-defined order.
  • When a procedure is called, control always returns to the point immediately after the call.

Activation

  • An invocation of procedure $P$ is an activation of $P$.
  • The lifetime of an activation of $P$ is
    • (Run-time) All the steps to execute $P$
    • (Nested) Including all the steps in procedures $P$ calls
  • The lifetime of a variable x is the portion of execution in which x is defined.
    • Lifetime is a dynamic (run-time) concept.
    • Scope is a static concept.
  • Lifetimes of procedure activations are properly nested.
    • Use a stack to track currently active procedures.

Activation Records

The information needed to manage one procedure activation is called an activation record (AR) or frame.

Scenario

A procedure $F$ calls $G$, then $G$'s activation record contains a mix of info about $F$ and $G$. $F$ is suspended until $G$ completes, at which point $F$ resumes.

$G$’s AR contains

  • Complete execution of $G$
  • Resume execution of $F$
  • Space for $G$'s return value
  • Actual parameters
  • Pointer to the previous activation record (control link: points to AR of caller of $G$)
  • Machine status prior to calling $G$
    • Contents of registers & program counter
    • Local variables
  • Other temporary values

AR for function f

  • (Low)
  • Code
  • Static Data
  • Main
  • f: result
  • f: argument
  • f: control link
  • f: return address
  • Empty
  • Heap
  • (High)

Globals & Heaps

Globals are assigned a fixed address once.

Memory Management

  • Code Segment: Object code
    • For many languages, fixed size and read only
  • Static: data (not code) with fixed addresses (e.g., global data)
    • Fixed size, may be readable or writable
  • Stack: an AR for each currently active procedure
    • Each AR usually fixed size, contains locals
  • Heap: all other data
    • In C, heap is managed by malloc and free.

Alignment

Data is word aligned if it begins at a word boundary.

To align the 5-ch string, add 3 padding characters to the string.

Stack Machines

Goal

Use the stack to handle operating.

An instruction $r = F(a_1,\dots,a_n)$

  • Pops n operands from the stack
  • Computes the operation $F$ using the operands
  • Pushes the result r on the stack

Location of the operands/result always on the top of the stack.

n-register Stack Machine: keep the top n locations of the pure stack machine's stack in registers

1-register stack machine: The register is called the accumulator

  • Add: acc = acc + top_of_stack

Consider an expression $op(e_1,...,e_n)$ ($e_1,...,e_n$ are subexpressions)

  • For each $e_i\ (0 < i < n)$
    • Compute $e_i$
    • Push result on the stack
  • pop n-1 values from the stack, compute op
  • Store result in the accumulator

In [ ]: